110 research outputs found

    A resource-frugal probabilistic dictionary and applications in (meta)genomics

    Genomic and metagenomic fields, generating huge sets of short genomic sequences, brought their own share of high performance problems. To extract relevant pieces of information from the huge data sets generated by current sequencing techniques, one must rely on extremely scalable methods and solutions. Indexing billions of objects is a task considered too expensive while being a fundamental need in this field. In this paper we propose a straightforward indexing structure that scales to billions of elements and we propose two direct applications in genomics and metagenomics. We show that our proposal solves problem instances for which no other known solution scales up. We believe that many tools and applications could benefit from either the fundamental data structure we provide or from the applications developed from this structure. Comment: Submitted to PSC 201

    Some considerations for analyzing biodiversity using integrative metagenomics and gene networks

    Background: Improving knowledge of biodiversity will benefit conservation biology, enhance bioremediation studies, and could lead to new medical treatments. However there is no standard approach to estimate and to compare the diversity of different environments, or to study its past, and possibly, future evolution. Presentation of the hypothesis: We argue that there are two conditions for significant progress in the identification and quantification of biodiversity. First, integrative metagenomic studies - aiming at the simultaneous examination (or even better at the integration) of observations about the elements, functions and evolutionary processes captured by the massive sequencing of multiple markers - should be preferred over DNA barcoding projects and over metagenomic projects based on a single marker. Second, such metagenomic data should be studied with novel inclusive network-based approaches, designed to draw inferences both on the many units and on the many processes present in the environments. Testing the hypothesis: We reached these conclusions through a comparison of the theoretical foundations of two molecular approaches seeking to assess biodiversity: metagenomics (mostly used on prokaryotes and protists) and DNA barcoding (mostly used on multicellular eukaryotes), and by pragmatic considerations of the issues caused by the 'species problem' in biodiversity studies. Implications of the hypothesis: Evolutionary gene networks reduce the risk of producing biodiversity estimates with limited explanatory power, biased either by unequal rates of LGT, or difficult to interpret due to (practical) problems caused by type I and type II grey zones. Moreover, these networks would easily accommodate additional (meta)transcriptomic and (meta)proteomic data. Reviewers: This article was reviewed by Pr. William Martin, Dr. David Williams (nominated by Pr. J Peter Gogarten) & Dr. James McInerney (nominated by Pr. John Logsdon).

    Minimal perfect hash functions in large scale bioinformatics Problem

    International audience. Genomic and metagenomic fields, generating huge sets of short genomic sequences, brought their own share of high performance problems. To extract relevant pieces of information from the huge data sets generated by current sequencing techniques, one must rely on extremely scalable methods and solutions. Indexing billions of objects is a task considered too expensive while being a fundamental need in this field. In this paper we propose a straightforward indexing structure that scales to billions of elements and we propose two direct applications in genomics and metagenomics. We show that our proposal solves problem instances for which no other known solution scales up. We believe that many tools and applications could benefit from either the fundamental data structure we provide or from the applications developed from this structure.

    Testing ecological theories with sequence similarity networks: marine ciliates exhibit similar geographic dispersal patterns as multicellular organisms

    International audience. Background: High-throughput sequencing technologies are lifting major limitations to molecular-based ecological studies of eukaryotic microbial diversity, but analyses of the resulting millions of short sequences remain a major bottleneck for these approaches. Here, we introduce the analytical and statistical framework of sequence similarity networks, increasingly used in evolutionary studies and graph theory, into the field of ecology to analyze novel pyrosequenced V4 small subunit rDNA (SSU-rDNA) sequence data sets in the context of previous studies, including SSU-rDNA Sanger sequence data from cultured ciliates and from previous environmental diversity inventories. Results: Our broadly applicable protocol quantified the progress in the description of genetic diversity of ciliates by environmental SSU-rDNA surveys, detected a fundamental historical bias in the tendency to recover already known groups in these surveys, and revealed substantial amounts of hidden microbial diversity. Moreover, network measures demonstrated that ciliates are not globally dispersed, but are structured by habitat and geographical location at intermediate geographical scale, as observed for bacteria, plants, and animals. Conclusions: Currently available ‘universal’ primers used for local in-depth sequencing surveys offer little hope of exhausting ciliate (and most likely microbial) diversity, which is significantly higher than previously thought. Network analyses such as those presented in this study offer a promising way to guide the design of novel primers and to further explore this vast and structured microbial diversity.

    Time-Dependent Internalization of Polymer-Coated Silica Nanoparticles in Brain Endothelial Cells and Morphological and Functional Effects on the Blood-Brain Barrier

    Nanoparticle (NP)-assisted procedures including laser tissue soldering (LTS) offer advantages compared to conventional microsuturing, especially in the brain. In this study, effects of polymer-coated silica NPs used in LTS were investigated in human brain endothelial cells (ECs) and blood-brain barrier models. In the co-culture setting with ECs and pericytes, only the cell type directly exposed to NPs displayed a time-dependent internalization. No transfer of NPs between the two cell types was observed. Cell viability decreased in relation to NP exposure duration and concentration. Protein expression of the nuclear factor kappa-light-chain-enhancer of activated B cells and various endothelial adhesion molecules indicated no initiation of inflammation or activation of ECs after NP exposure. After differentiation of CD34+ ECs into brain-like ECs co-cultured with pericytes, blood-brain barrier (BBB) characteristics were obtained. The established endothelial layer reduced the passage of integrity tracer molecules. NP exposure did not result in alterations of junctional proteins, BBB formation or its integrity. In a 3-dimensional setup with endothelial tube formation and tight junctions, barrier formation was not disrupted by the NPs, and NPs do not seem to cross the blood-brain barrier. Our findings suggest that these polymer-coated silica NPs do not damage the BBB.

    Surface roughness in finishing diamond-abrasive machining

    It is shown that the roughness of a polished surface depends on the ratio of the natural vibration frequencies of the molecular fragments on the surfaces of the tool and the workpiece. The roughness of the machined surface is most strongly influenced by the number of molecular fragments that make up the sludge particles, their most probable size, the natural vibration frequencies of the fragments of the workpiece material and the tool, the thermal conductivity of the workpiece material, and the machining conditions.

    The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy

    International audience. The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssu-rrna.org/) provides unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences, with curated taxonomy. The database mainly consists of nuclear-encoded protistan sequences. However, metazoans, land plants, macroscopic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. Taxonomic assignation of sequences consists of eight unique taxonomic fields. In total, 136 866 sequences are nuclear encoded, 45 708 (36 501 mitochondrial and 9657 chloroplastic) are from organelles, the remaining being putative chimeric sequences. The website allows users to download sequences from the entire and partial databases (including representative sequences after clustering at a given level of similarity). Different web tools also allow searches by sequence similarity. The presence of both rRNA and rDNA sequences, taking into account introns (crucial for eukaryotic sequences), a normalized eight-term ranked taxonomy and updates of new GenBank releases were made possible by a long-term collaboration between experts in taxonomy and computer scientists.

    Plankton networks driving carbon export in the oligotrophic ocean

    The biological carbon pump is the process by which CO2 is transformed to organic carbon via photosynthesis, exported through sinking particles, and finally sequestered in the deep ocean. While the intensity of the pump correlates with plankton community composition, the underlying ecosystem structure driving the process remains largely uncharacterized. Here we use environmental and metagenomic data gathered during the Tara Oceans expedition to improve our understanding of carbon export in the oligotrophic ocean. We show that specific plankton communities, from the surface and deep chlorophyll maximum, correlate with carbon export at 150 m and highlight unexpected taxa such as Radiolaria and alveolate parasites, as well as Synechococcus and their phages, as lineages most strongly associated with carbon export in the subtropical, nutrient-depleted, oligotrophic ocean. Additionally, we show that the relative abundance of a few bacterial and viral genes can predict a significant fraction of the variability in carbon export in these regions.